When AI Bends Metal: AI-Assisted Optimization of Design Parameters in Sheet Metal Forming
Tarraf, Ahmad, Kassem-Manthey, Koutaiba, Mohammadi, Seyed Ali, Martin, Philipp, Moj, Lukas, Burak, Semih, Park, Enju, Terboven, Christian, Wolf, Felix
Numerical simulations have revolutionized the industrial design process by reducing prototyping costs and design iterations and by enabling product engineers to explore the design space more efficiently. However, the growing scale of simulations demands substantial expert knowledge, computational resources, and time. A key challenge is identifying input parameters that yield optimal results, as iterative simulations are costly and can have a large environmental impact. This paper presents an AI-assisted workflow that reduces expert involvement in parameter optimization through the use of Bayesian optimization. Furthermore, we present an active-learning variant of the approach that assists the expert on demand. A deep learning model provides an initial parameter estimate, from which the optimization cycle iteratively refines the design until a termination condition (e.g., an energy budget or iteration limit) is met. We demonstrate our approach on a sheet metal forming process and show how it accelerates the exploration of the design space while reducing the need for expert involvement.
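The loop the abstract describes (an initial design, a surrogate-guided proposal step, and a termination condition) can be sketched as a minimal one-dimensional Bayesian optimization in plain Python. The Gaussian-process surrogate, grid-based expected-improvement acquisition, and toy objective below are illustrative stand-ins, not the paper's implementation.

```python
import math
import random

def rbf(a, b, length=0.15):
    # Squared-exponential kernel on a scalar design parameter.
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(A, y):
    # Gaussian elimination with partial pivoting (small dense systems only).
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def expected_improvement(mean, sd, best):
    # EI acquisition for minimization.
    z = (best - mean) / sd
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + sd * pdf

def bayes_opt(simulate, budget=20, n_init=5, seed=1):
    # Initial design, then refine until the evaluation budget is exhausted.
    rng = random.Random(seed)
    X = [rng.random() for _ in range(n_init)]
    Y = [simulate(x) for x in X]
    grid = [i / 100.0 for i in range(101)]
    while len(X) < budget:                      # termination condition
        n = len(X)
        K = [[rbf(X[i], X[j]) + (1e-6 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        alpha = solve(K, Y)
        best = min(Y)
        def ei_at(g):
            ks = [rbf(x, g) for x in X]
            m = sum(k * a for k, a in zip(ks, alpha))
            v = solve(K, ks)
            var = max(1e-12, 1.0 - sum(k * w for k, w in zip(ks, v)))
            return expected_improvement(m, math.sqrt(var), best)
        x_next = max(grid, key=ei_at)           # propose the next simulation
        X.append(x_next)
        Y.append(simulate(x_next))
    i = min(range(len(Y)), key=lambda j: Y[j])
    return X[i], Y[i]

# Toy stand-in for an expensive forming simulation (e.g., springback error
# as a function of one normalized process parameter).
objective = lambda x: (x - 0.62) ** 2 + 0.05 * math.sin(20.0 * x)
x_best, y_best = bayes_opt(objective)
```

Each proposed point costs one simulation, so the acquisition step tries to spend the remaining budget where the surrogate predicts the largest improvement.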
- North America > United States (0.14)
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (5 more...)
- Asia > Vietnam (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- (2 more...)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.46)
Using Imperfect Synthetic Data in Downstream Inference Tasks
Byun, Yewon, Gupta, Shantanu, Lipton, Zachary C., Childers, Rachel Leah, Wilder, Bryan
Predictions and generations from large language models are increasingly being explored as an aid to computational social science and human subject research in limited-data regimes. While previous technical work has explored the principled use of model-predicted labels for unlabeled data, there is increasing interest in using large language models to generate entirely new synthetic samples (also termed synthetic simulations), such as responses to surveys. However, it is not immediately clear how practitioners can combine such data with real data and still draw statistically valid conclusions. In this work, we introduce a new estimator based on the generalized method of moments, providing a hyperparameter-free solution with strong theoretical guarantees to address the challenge at hand. Surprisingly, we find that interactions between the moment residuals of synthetic data and those of real data can improve estimates of the target parameter. We empirically validate the finite-sample performance of our estimator across different regression tasks in computational social science applications, demonstrating large empirical gains.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > Middle East > Jordan (0.04)
Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis)
This thesis contains a series of independent contributions to statistics, unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. Mathematical insights are obtained from concrete examples, and these insights are generalized to principles that permeate the rest of the thesis. The second chapter studies the concept of local independence, which describes whether the evolution of one stochastic process is directly influenced by another. To test local independence, we define a model-free parameter called the Local Covariance Measure (LCM). We formulate an estimator for the LCM, from which a test of local independence is proposed. We discuss how the size and power of the proposed test can be controlled uniformly and investigate the test in a simulation study. The third chapter focuses on covariate adjustment, a method used to estimate the effect of a treatment by accounting for observed confounding. We formulate a general framework that facilitates adjustment for any subset of covariate information. We identify the optimal covariate information for adjustment and, based on this, introduce the Debiased Outcome-adapted Propensity Estimator (DOPE) for efficient estimation of treatment effects. An instance of DOPE is implemented using neural networks, and we demonstrate its performance on simulated and real data. The fourth and final chapter introduces a model-free measure of the conditional association between an exposure and a time-to-event, which we call the Aalen Covariance Measure (ACM). We develop a model-free estimation method and show that it is doubly robust, ensuring $\sqrt{n}$-consistency provided that the nuisance functions can be estimated with modest rates. A simulation study demonstrates the use of our estimator in several settings.
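The covariate-adjustment idea of the third chapter can be illustrated with a deliberately simple sketch: under randomized treatment, residualizing the outcome on an observed covariate before differencing the arms leaves the effect estimate unbiased while shrinking its variance. This is textbook least-squares adjustment, not the thesis's DOPE estimator, and every constant below is invented.

```python
import random

def simulate(n, tau=2.0, seed=0):
    # Toy randomized study: treatment is assigned by a fair coin, and the
    # outcome depends strongly on an observed covariate x.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        t = int(rng.random() < 0.5)
        y = tau * t + 3.0 * x + rng.gauss(0.0, 1.0)
        data.append((x, t, y))
    return data

def diff_in_means(data):
    # Unadjusted estimator: raw difference of arm means.
    y1 = [y for _, t, y in data if t == 1]
    y0 = [y for _, t, y in data if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

def adjusted(data):
    # Residualize the outcome on the covariate by simple least squares, then
    # difference the mean residuals between arms.
    xs = [x for x, _, _ in data]
    ys = [y for _, _, y in data]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    resid = [(x, t, y - beta * (x - mx)) for x, t, y in data]
    return diff_in_means(resid)
```

Because treatment is randomized, the pooled regression slope is (asymptotically) unconfounded with the effect, so the adjusted estimator targets the same quantity with much less noise.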
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (8 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.92)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Energy (0.67)
Online Data Collection for Efficient Semiparametric Inference
Gupta, Shantanu, Lipton, Zachary C., Childers, David
While many works have studied statistical data fusion, they typically assume that the various datasets are given in advance. However, in practice, estimation requires difficult data collection decisions like determining the available data sources, their costs, and how many samples to collect from each source. Moreover, this process is often sequential because the data collected at a given time can improve collection decisions in the future. In our setup, given access to multiple data sources and budget constraints, the agent must sequentially decide which data source to query to efficiently estimate a target parameter. We formalize this task using Online Moment Selection, a semiparametric framework that applies to any parameter identified by a set of moment conditions. Interestingly, the optimal budget allocation depends on the (unknown) true parameters. We present two online data collection policies, Explore-then-Commit and Explore-then-Greedy, that use the parameter estimates at a given time to optimally allocate the remaining budget in the future steps. We prove that both policies achieve zero regret (assessed by asymptotic MSE) relative to an oracle policy. We empirically validate our methods on both synthetic and real-world causal effect estimation tasks, demonstrating that the online data collection policies outperform their fixed counterparts.
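The Explore-then-Commit idea can be sketched for the simplest case, two sources measuring the same scalar target parameter: a pilot phase estimates each source's noise, and the remaining budget then goes to the source with the best precision per unit cost. The sources, costs, and constants below are invented, and the sketch omits the paper's Online Moment Selection machinery.

```python
import random

def sample(source, rng):
    # Two hypothetical sources measuring the same target parameter
    # (true mean 1.0) with different noise levels.
    if source == 0:
        return rng.gauss(1.0, 0.5)
    return rng.gauss(1.0, 2.0)

COST = [1.0, 1.0]   # per-sample query cost of each source

def explore_then_commit(budget=400, explore_frac=0.2, seed=0):
    rng = random.Random(seed)
    draws = [[], []]
    # Explore: split a fraction of the budget evenly across the sources.
    for s in (0, 1):
        for _ in range(int(budget * explore_frac / 2 / COST[s])):
            draws[s].append(sample(s, rng))
    def var(v):
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v) / (len(v) - 1)
    # Commit: each unit of budget spent on source s raises the precision of
    # the inverse-variance-weighted estimate by 1 / (variance * cost), so the
    # remaining budget goes to the source minimizing variance * cost.
    best = min((0, 1), key=lambda s: var(draws[s]) * COST[s])
    for _ in range(int(budget * (1 - explore_frac) / COST[best])):
        draws[best].append(sample(best, rng))
    # Combine the source means with inverse-variance weights.
    weights = [len(draws[s]) / var(draws[s]) for s in (0, 1)]
    means = [sum(draws[s]) / len(draws[s]) for s in (0, 1)]
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)

est = explore_then_commit()
```

The pilot samples are what make the allocation adaptive: the optimal split depends on the unknown noise levels, which only the data collected so far can reveal.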
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- (2 more...)
Apriori_Goal algorithm for constructing association rules for a database with a given classification
An efficient algorithm, Apriori_Goal, is proposed for constructing association rules for a relational database with a given classification. The algorithm's features stem from the specifics of the database and the method of encoding its records. The algorithm employs five criteria that characterize the quality of the constructed rules. Additional criteria are proposed for filtering the sets used when constructing association rules. The proposed method of encoding records allows for an efficient implementation of the basic operation underlying the computation of rule characteristics. The algorithm works with a relational database whose columns can be of different types, both continuous and discrete. Among the columns, a target discrete column is distinguished, which defines the classification of the records. This allows the original database to be divided into $n$ subsets according to the number of categories of the target parameter. A classical example of such databases is medical databases, where the target parameter is the diagnosis established by doctors. A preprocessor, which is an important part of the algorithm, converts the properties of the objects represented by the columns of the original database into binary properties and encodes each record as a single integer. In addition to saving memory, the proposed format completely preserves the information about the binary properties representing the original record. More importantly, the computationally intensive operations on records required for calculating rule characteristics are performed almost instantly in this format using a pair of logical operations on integers.
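The preprocessor's encoding, packing a record's binary properties into a single integer so that the basic support operation reduces to one logical AND and a comparison, can be shown in a few lines (the property names are illustrative, not from the paper):

```python
# Each record's binary properties are packed into one integer, one bit per
# property; an itemset is another bit mask.  A record supports the itemset
# iff all of its bits are present: a single AND plus a comparison.
PROPERTIES = ["fever", "cough", "headache", "high_bp"]  # illustrative names
BIT = {p: 1 << i for i, p in enumerate(PROPERTIES)}

def encode(props):
    code = 0
    for p in props:
        code |= BIT[p]
    return code

def support(records, itemset_mask):
    # Count the records that contain every property in the itemset.
    return sum(1 for r in records if r & itemset_mask == itemset_mask)

records = [encode(p) for p in (
    ["fever", "cough"],
    ["fever", "cough", "headache"],
    ["high_bp"],
    ["cough"],
)]
print(support(records, encode(["fever", "cough"])))  # -> 2
```

Whatever the number of binary properties, the per-record cost of the support check stays a couple of machine-word operations, which is the source of the speedup the abstract describes.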
- North America > United States > Florida > Brevard County > Melbourne (0.04)
- Europe > United Kingdom > England > East Sussex > Brighton (0.04)
- Europe > Russia > Central Federal District > Tver Oblast > Tver (0.04)
- (3 more...)
- Workflow (0.46)
- Research Report (0.40)
VehicleSDF: A 3D generative model for constrained engineering design via surrogate modeling
Morita, Hayata, Shintani, Kohei, Yuan, Chenyang, Permenter, Frank
A main challenge in mechanical design is to efficiently explore the design space while satisfying engineering constraints. This work explores the use of 3D generative models to explore the design space in the context of vehicle development, while estimating and enforcing engineering constraints. Specifically, we generate diverse 3D models of cars that meet a given set of geometric specifications, while also obtaining quick estimates of performance parameters such as aerodynamic drag. For this, we employ a data-driven approach (using the ShapeNet dataset) to train VehicleSDF, a DeepSDF-based model that represents potential designs in a latent space which can be decoded into a 3D model. We then train surrogate models to estimate engineering parameters from this latent-space representation, enabling us to efficiently optimize latent vectors to match specifications. Our experiments show that we can generate diverse 3D models while matching the specified geometric parameters. Finally, we demonstrate that other performance parameters such as aerodynamic drag can be estimated in a differentiable pipeline.
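The optimize-in-latent-space step can be sketched with toy quadratic surrogates standing in for the trained networks, and central finite differences standing in for automatic differentiation; every function and constant below is invented for illustration.

```python
# Toy quadratic surrogates standing in for trained networks: geom(z) predicts
# a geometric specification (say, vehicle length in meters) and drag(z) a
# drag coefficient, both from a 2-D latent vector.
def geom(z):
    return 4.0 + 0.8 * z[0] + 0.3 * z[1]

def drag(z):
    return 0.30 + 0.02 * z[0] ** 2 + 0.05 * (z[1] - 0.5) ** 2

def loss(z, target_len):
    # Match the geometric spec while lightly penalizing predicted drag.
    return (geom(z) - target_len) ** 2 + 0.1 * drag(z)

def grad(z, target_len, eps=1e-5):
    # Central finite differences stand in for automatic differentiation.
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        g.append((loss(zp, target_len) - loss(zm, target_len)) / (2 * eps))
    return g

def optimize_latent(target_len=4.5, steps=800, lr=1.0):
    # Gradient descent on the latent vector through the surrogate losses.
    z = [0.0, 0.0]
    for _ in range(steps):
        g = grad(z, target_len)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
    return z
```

The design choice the abstract hinges on is that the surrogates are cheap and differentiable, so many latent candidates can be screened without running a full geometry or CFD evaluation.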
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
- Information Technology > Artificial Intelligence > Natural Language > Generation (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Adaptive-TMLE for the Average Treatment Effect based on Randomized Controlled Trial Augmented with Real-World Data
van der Laan, Mark, Qiu, Sky, van der Laan, Lars
We consider the problem of estimating the average treatment effect (ATE) when both randomized controlled trial (RCT) data and real-world data (RWD) are available. We decompose the ATE estimand as the difference between a pooled-ATE estimand that integrates RCT and RWD and a bias estimand that captures the conditional effect of RCT enrollment on the outcome. We introduce an adaptive targeted minimum loss-based estimation (A-TMLE) framework to estimate them. We prove that the A-TMLE estimator is root-n-consistent and asymptotically normal. Moreover, in finite samples, it achieves the super-efficiency one would obtain had one known the oracle model for the conditional effect of RCT enrollment on the outcome. Consequently, the smaller the working model of the bias induced by the RWD, the greater our estimator's efficiency, while our estimator will always be at least as efficient as an efficient estimator that uses the RCT data only. In simulations, A-TMLE outperforms existing methods by achieving smaller mean squared error and narrower 95% confidence intervals. A-TMLE could help utilize RWD to improve the efficiency of randomized trial results without biasing the estimates of intervention effects. This approach could allow for smaller, faster trials, decreasing the time until patients can receive effective treatments.
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > Maryland > Montgomery County > Silver Spring (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Shirakawa, Toru, Li, Yi, Wu, Yulun, Qiu, Sky, Li, Yuxuan, Zhao, Mingduo, Iso, Hiroyasu, van der Laan, Mark
We propose Deep Longitudinal Targeted Minimum Loss-based Estimation (Deep LTMLE), a novel approach to estimating the counterfactual mean of an outcome under dynamic treatment policies in longitudinal settings. Our approach utilizes a transformer architecture with heterogeneous type embedding trained using temporal-difference learning. After obtaining an initial estimate with the transformer, we statistically correct for the bias commonly associated with machine learning algorithms, following the targeted minimum loss-based estimation (TMLE) framework. Our method also facilitates statistical inference by providing 95% confidence intervals grounded in asymptotic statistical theory. Simulation results demonstrate our method's superior performance over existing approaches, particularly in complex, long time-horizon scenarios. It remains effective in small-sample, short-duration contexts, matching the performance of asymptotically efficient estimators. To demonstrate our method in practice, we applied it to estimate counterfactual mean outcomes under standard versus intensive blood pressure management strategies in a real-world cardiovascular epidemiology cohort study.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- (5 more...)
- Research Report > New Finding (0.86)
- Research Report > Strength Medium (0.66)
- Research Report > Experimental Study (0.66)
- Health & Medicine > Epidemiology (0.48)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)